A Bit-Serial Architecture for 1-D Multiplierless DCT
Abstract
The Discrete Cosine Transform (DCT) is of significant interest in image compression because of its high energy compaction. It has become the core of many international standards, such as JPEG, H.26x and the MPEG family [1-3]. Many fast algorithms have been developed to speed up DCT computation in both software and hardware implementations. A 2-D DCT can be computed easily by repeatedly applying a 1-D DCT scheme, whereas a direct implementation of the 2-D DCT generally requires more effort.

Most DCT computations require floating-point multiplications, which are slow and cumbersome. Floating-point arithmetic can be avoided by using integer implementations, which are usually based on distributed arithmetic [4]. However, these still have drawbacks, since the fixed-point multiplications need a rather wide data bus (32 bits, for instance). This limits their use in low-power applications such as handheld devices. Based on Chen's factorisation of the DCT matrix [5], Tran et al. [6-7] proposed an approximate computation of the DCT by introducing the lifting scheme. Each basis multiplication is approximated by a rational of the form k/2^m, which can be implemented efficiently with binary shifts. This multiplierless DCT is also known as the binary DCT or binDCT. Both the forward and the inverse transforms can be implemented in a similar manner. The implementation can be made simpler and more regular by using the scaled DCT and in-place computation.

Our work concerns the implementation of a 1-D DCT based on Liang and Tran's work [7]. For very low bit-rate applications with fairly high compression gain, we base our design on a bit-serial computation scheme. The resulting design is compact and consumes little power. Section 2 outlines the fast DCT techniques and reviews in detail the multiplierless binDCT approximation proposed in [7]. Our approximation is detailed in Section 3, where the effects of different computation word lengths are also investigated. Section 4 demonstrates the use of a bit-serial architecture in the implementation of such a binDCT. Simulation results for both software and hardware are given.
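The idea of replacing a multiplication by a rational k/2^m inside a lifting step can be illustrated with a minimal Python sketch. The function names and the coefficients 3/16 and 3/8 (rough dyadic approximations of tan(pi/16) and sin(pi/8)) are assumptions chosen for illustration only; they are not the coefficients of the binDCT in [7]. The sketch shows that a plane rotation built from three such lifting steps uses only shifts and additions and remains exactly invertible on integers.

```python
def shift_add_scale(x: int, k: int, m: int) -> int:
    """Return floor(x * k / 2**m) using only shifts and additions (k >= 0)."""
    acc = 0
    bit = 0
    while k:
        if k & 1:
            acc += x << bit  # add x * 2**bit for every set bit of k
        k >>= 1
        bit += 1
    return acc >> m  # arithmetic right shift divides by 2**m and floors


def forward(a: int, b: int) -> tuple[int, int]:
    """Approximate rotation by pi/8 with three lifting steps (illustrative values)."""
    a = a - shift_add_scale(b, 3, 4)  # 3/16 roughly tan(pi/16)
    b = b + shift_add_scale(a, 3, 3)  # 3/8 roughly sin(pi/8)
    a = a - shift_add_scale(b, 3, 4)
    return a, b


def inverse(a: int, b: int) -> tuple[int, int]:
    """Undo forward() by replaying the lifting steps in reverse with opposite signs."""
    a = a + shift_add_scale(b, 3, 4)
    b = b - shift_add_scale(a, 3, 3)
    a = a + shift_add_scale(b, 3, 4)
    return a, b


if __name__ == "__main__":
    y = forward(100, 0)
    print(y, inverse(*y))
```

Because each lifting step changes one variable by a deterministic function of the other, the inverse recovers the input exactly regardless of the truncation introduced by the right shift. This is why the forward and inverse transforms can share the same structure.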
Similar resources
A New Optimum-Word-Length-Assignment (OWLA) Multiplierless Integer DCT for Unified Lossless/Lossy Image Coding
Recently, we proposed a new multiplierless 1D Int-DCT, modified from our existing Int-DCT by approximating floating-point multiplications with bit-shift and addition operations. The multiplierless 1D Int-DCT works well for both lossless and lossy coding. However, our multiplierless 1D Int-DCT is not focused on how to assign word-length for floating-multiplier approximation as short as possi...
Fast multiplierless approximations of the DCT with the lifting scheme
In this paper, we present the design, implementation, and application of several families of fast multiplierless approximations of the discrete cosine transform (DCT) with the lifting scheme called the binDCT. These binDCT families are derived from Chen’s and Loeffler’s plane rotation-based factorizations of the DCT matrix, respectively, and the design approach can also be applied to a DCT of a...
Bit Serial Architecture for the Two-Dimensional DCT
We present an architecture for the calculation of the Two Dimensional Discrete Cosine Transform and its Inverse that admits a high data rate. It is based on the row-column decomposition, the use of a fast algorithm, serial digit arithmetic and redundant coding. The critical path is set by the delay of a multiplexer plus a binary adder with as many digits as the width of the serial digits to be ...
Forward and Inverse 2-D DCT Architectures Targeting HDTV for H.264/AVC Video Compression Standard
This paper presents the architecture and the VHDL design of the integer Two-Dimensional Discrete Cosine Transform (2-D DCT) used in the H.264/AVC codecs. The forward and inverse 2-D DCT architectures were designed and their synthesis results mapped to Altera FPGAs are presented. The 2-D DCT calculation is performed by exploiting the separability property; in this way, each 2-D DCT architecture...
Multiplierless Approximate 4-point DCT VLSI Architectures for Transform Block Coding
Two multiplierless algorithms are proposed for the 4×4 approximate DCT for transform coding in digital video. Computational architectures for 1-D/2-D realisations are implemented using Xilinx FPGA devices. CMOS synthesis at the 45 nm node indicates real-time operation at 1 GHz, yielding 4×4 block rates of 125 MHz at less than 120 mW of dynamic power consumption.